<!DOCTYPE html>

<html>

<head>

<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta http-equiv="X-UA-Compatible" content="IE=EDGE" />

<meta name="viewport" content="width=device-width, initial-scale=1" />



<title>COVID-19 condition study in the United States</title>

<script src="site_libs/accessible-code-block-0.0.1/empty-anchor.js"></script>
<link href="site_libs/anchor-sections-1.0/anchor-sections.css" rel="stylesheet" />
<script src="site_libs/anchor-sections-1.0/anchor-sections.js"></script>





<link rel="stylesheet" href="report_files/style.css" type="text/css" />





</head>

<body>




<section class="page-header">
<h1 class="title toc-ignore project-name">COVID-19 condition study in the United States</h1>
</section>



<section class="main-content">
<div id="introduction" class="section level2">
<h2>Introduction</h2>
<p>COVID-19 is a global pandemic that affects our health and daily life. By exploring the COVID-19 condition, we can take action to better contain its spread and plan for the future. In this project, we mainly study the COVID-19 condition in the United States.</p>
<p>What is the current COVID-19 condition in the United States? The question can be answered from the following perspectives:</p>
<ul>
<li>Q1: Condition Overview: latest numbers about the tests, confirmed cases, deaths and recoveries.</li>
<li>Q2: Pandemic Tendency: the tendency of COVID-19 infection in terms of new tests, new positives and new deaths.</li>
<li>Q3: Community Infection Status: the infection status of different race/ethnicity and age groups.</li>
<li>Q4: Population Infection Rate: the population infection rate in different states, which can serve as an indicator of how far the virus has spread.</li>
<li>Q5: Key Information: the key latest information or notes we should pay attention to.</li>
</ul>
</div>
<div id="methods" class="section level2">
<h2>Methods</h2>
<p>The data for this study comes from three sources:</p>
<ol style="list-style-type: decimal">
<li><a href="https://covidtracking.com/data/api" class="uri">https://covidtracking.com/data/api</a></li>
<li><a href="https://covid.cdc.gov/covid-data-tracker/#demographics" class="uri">https://covid.cdc.gov/covid-data-tracker/#demographics</a></li>
<li><a href="https://github.com/nytimes/covid-19-data" class="uri">https://github.com/nytimes/covid-19-data</a></li>
</ol>
<p>The data files are described as follows:</p>
<ul>
<li>cases_by_age_group.csv # COVID-19 cases in the United States grouped by age bins, last updated on Nov 17 2020 at 12:18PM.</li>
<li>cases_by_race_ethnicity__all_age_groups.csv # COVID-19 cases in the United States grouped by race/ethnicity, last updated on Nov 17 2020 at 12:18PM.</li>
<li>daily.csv # historical COVID-19 data for the United States, recorded daily.</li>
<li>deaths_by_age_group.csv # COVID-19 deaths in the United States grouped by age bins, last updated on Nov 17 2020 at 12:18PM.</li>
<li>deaths_by_race_ethnicity__all_age_groups.csv # COVID-19 deaths in the United States grouped by race/ethnicity, last updated on Nov 17 2020 at 12:18PM.</li>
<li>states_current.csv # the most recent COVID-19 data for every state.</li>
<li>states_daily.csv # all COVID-19 data available for every state since tracking started.</li>
<li>states_info.csv # basic information about states, including notes about the data.</li>
<li>us_census_2018_population_estimates_states.csv # population data for each state.</li>
</ul>
<p>This project uses the following packages to carry out the analysis:</p>
<table>
<thead>
<tr class="header">
<th align="left">Package</th>
<th align="left">Purpose</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">data.table</td>
<td align="left">reading in data</td>
</tr>
<tr class="even">
<td align="left">tidyverse</td>
<td align="left">data wrangling</td>
</tr>
<tr class="odd">
<td align="left">dplyr</td>
<td align="left">data transforming</td>
</tr>
<tr class="even">
<td align="left">plotly</td>
<td align="left">generate interactive graphics</td>
</tr>
<tr class="odd">
<td align="left">DT</td>
<td align="left">render R data frames to tables</td>
</tr>
<tr class="even">
<td align="left">knitr</td>
<td align="left">render R Markdown files to HTML, PDF, etc.</td>
</tr>
<tr class="odd">
<td align="left">stringr</td>
<td align="left">handle strings in R</td>
</tr>
<tr class="even">
<td align="left">sjPlot</td>
<td align="left">render data frames to publication-style tables</td>
</tr>
<tr class="odd">
<td align="left">ggthemes</td>
<td align="left">themes for ggplot graphics</td>
</tr>
<tr class="even">
<td align="left">scales</td>
<td align="left">tool for graphics scaling</td>
</tr>
<tr class="odd">
<td align="left">ggwordcloud</td>
<td align="left">generate word cloud</td>
</tr>
<tr class="even">
<td align="left">ggpubr</td>
<td align="left">arrange ggplot objects</td>
</tr>
</tbody>
</table>
</div>
<div id="results" class="section level2">
<h2>Results</h2>
<div id="q1-condition-overview" class="section level3">
<h3>Q1: Condition Overview</h3>
<p>Here is a summary of the latest numbers: total tests, accumulated confirmed cases, accumulated recoveries, and accumulated deaths.</p>
<table>
<colgroup>
<col width="9%" />
<col width="12%" />
<col width="35%" />
<col width="23%" />
<col width="19%" />
</colgroup>
<thead>
<tr class="header">
<th align="right">date</th>
<th align="right">total tests</th>
<th align="right">accumulated confirmed case numbers</th>
<th align="right">accumulated recoveries</th>
<th align="right">accumulated deaths</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="right">20201117</td>
<td align="right">170315721</td>
<td align="right">11202899</td>
<td align="right">4293640</td>
<td align="right">239784</td>
</tr>
</tbody>
</table>
</div>
<div id="q2-pandemic-tendency" class="section level3">
<h3>Q2: Pandemic Tendency</h3>
<p>The tendency of COVID-19 can reflect how this pandemic will proceed in the future. Is it getting better or worse? We can illustrate the pandemic tendency using three important variables: new COVID-19 tests, new positive cases, and new deaths. The result is shown in the following graph.</p>
<p><img src="tendency.png" width="100%" style="display: block; margin: auto;" /></p>
<p>The graph shows that COVID-19 testing keeps increasing. The line for new positive cases depicts the tendency of the pandemic: more and more people are getting infected, and the curve suggests that the spread is accelerating over time. We have therefore not yet reached the turning point at which the actual condition begins to improve.</p>
<p>In contrast, the curve of new deaths is flattening rather than rising along with new confirmed cases. This plausibly indicates that the virus has become less harmful than before, or that we have become more experienced in treating the disease. However, this pattern may be subject to the influence of many other factors.</p>
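<p>The "new" series discussed above can be derived from cumulative counts by differencing. The following is a minimal Python sketch, not the R code behind the report: the function names are hypothetical, the input numbers are made up for illustration, and the optional trailing average is an added smoothing step that the report does not state it uses.</p>

```python
def daily_increments(cumulative):
    """Return day-over-day differences of a cumulative series.

    The first day has no predecessor, so its increment is reported as None.
    """
    increments = [None]
    for previous, current in zip(cumulative, cumulative[1:]):
        increments.append(current - previous)
    return increments


def rolling_mean(values, window=7):
    """Trailing moving average; positions without a full window are None."""
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) == window and all(v is not None for v in chunk):
            result.append(sum(chunk) / window)
        else:
            result.append(None)
    return result


# Hypothetical cumulative positive counts for five consecutive days.
cumulative_positive = [100, 130, 170, 220, 300]
new_positive = daily_increments(cumulative_positive)
smoothed = rolling_mean(new_positive, window=3)
print(new_positive)
print(smoothed)
```

<p>A rising sequence of increments, as in this toy input, corresponds to the accelerating spread described above; a flat sequence would correspond to the flattening seen in the deaths curve.</p>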
</div>
<div id="q3-community-infection-status" class="section level3">
<h3>Q3: Community Infection Status</h3>
<p>Different age groups may differ in their susceptibility to the virus because of their different immune levels. In addition, infection may also differ with respect to race and ethnicity. We gathered data from <a href="https://covid.cdc.gov" class="uri">https://covid.cdc.gov</a> to examine the infection status of different age groups and of different races and ethnicities.</p>
<p><img src="race.png" width="100%" style="display: block; margin: auto;" /></p>
<p>From the graph, we can see large differences among races and ethnicities. White Non-Hispanic has both the highest infection rate and the highest COVID-19 death rate, whereas Asian Non-Hispanic, American Indian, and Native Hawaiian have markedly lower infection and death rates.</p>
<p><img src="age.png" width="100%" style="display: block; margin: auto;" /></p>
<p>This graph of infection by age group shows that younger people have a higher infection rate than other groups. Nevertheless, older people tend to be affected more seriously by the virus, which leads to a higher death rate among them. We add a smooth line to the differences using the <code>loess</code> method.</p>
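<p>To give a sense of what a loess smooth does, here is a simplified Python sketch of locally weighted regression. It is an illustration under stated assumptions rather than the routine used for the figure: it fits a local straight line with tricube weights, whereas R's <code>loess</code> defaults to a local quadratic, and the function names and the age-group values below are hypothetical.</p>

```python
def tricube(u):
    """Tricube kernel: weight 1 at distance 0, shrinking to 0 at distance 1 and beyond."""
    return max(0.0, 1.0 - abs(u) ** 3) ** 3


def loess_linear(xs, ys, span=0.75):
    """Fit a weighted straight line around each x and return the fitted values."""
    n = len(xs)
    k = min(n, max(2, round(span * n)))  # neighbourhood size for each local fit
    fitted = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the local bandwidth.
        bandwidth = sorted(abs(x - x0) for x in xs)[k - 1] or 1.0
        w = [tricube((x - x0) / bandwidth) for x in xs]
        sw = sum(w)
        swx = sum(wi * x for wi, x in zip(w, xs))
        swy = sum(wi * y for wi, y in zip(w, ys))
        swxx = sum(wi * x * x for wi, x in zip(w, xs))
        swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        denom = sw * swxx - swx * swx
        if abs(denom) > 1e-12:
            # Closed-form weighted least squares for a straight line.
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
        else:
            # Degenerate neighbourhood: fall back to the weighted mean.
            fitted.append(swy / sw)
    return fitted


# Hypothetical age-group midpoints and a made-up noisy difference per group.
age_midpoints = [5, 15, 25, 35, 45, 55, 65, 75, 85]
differences = [0.8, 1.1, 1.9, 1.4, 1.0, 0.2, -0.9, -1.6, -2.4]
smooth = loess_linear(age_midpoints, differences)
print([round(v, 2) for v in smooth])
```

<p>A smaller <code>span</code> follows the points more closely, while a larger one gives a smoother line.</p>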
</div>
<div id="q4-population-infection-rate" class="section level3">
<h3>Q4: Population Infection Rate</h3>
<p>The population infection rate in different states can serve as an indicator of virus spreading level.</p>
<p><img src="infection_rate.png" width="100%" style="display: block; margin: auto;" /></p>
<p>From the map, we can see that the population infection rate currently differs from state to state, and some states are more seriously affected than others. Another point to note is that the population infection rate has reached a significant level of around 2%.</p>
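<p>The population infection rate is taken here to mean confirmed cases divided by population. The short Python sketch below works under that assumption; the function name, the state labels, and the numbers are hypothetical placeholders, not values from the data files.</p>

```python
def infection_rate(confirmed_cases, population):
    """Share of the population with a confirmed positive result."""
    if not population > 0:
        raise ValueError("population must be positive")
    return confirmed_cases / population


# Hypothetical (confirmed cases, population) pairs, for illustration only.
states = {
    "State A": (50000, 2500000),
    "State B": (12000, 1200000),
}

rates = {name: infection_rate(cases, pop) for name, (cases, pop) in states.items()}
ranked = sorted(rates, key=rates.get, reverse=True)  # highest rate first

for name in ranked:
    print(f"{name}: {rates[name]:.2%}")
```

<p>Dividing by population makes states of very different sizes comparable, which raw case counts do not.</p>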
</div>
<div id="q5-key-information" class="section level3">
<h3>Q5: Key Information</h3>
<p>The COVID Tracking Project also gathers notes from every state. These notes tell us what is happening with the COVID-19 status in each state. In other words, we can catch the latest and most important information by reading them. Simple text mining, such as counting n-grams, can give us a rough idea of the topics in these notes. Here, we choose tri-grams.</p>
<p><img src="wordcloud.png" width="100%" style="display: block; margin: auto;" /></p>
<p>From the tri-gram statistics, we can infer that “PCR test” is the most prominent topic across all states. This suggests that most states are currently focusing mainly on COVID-19 testing.</p>
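<p>Counting tri-grams amounts to sliding a three-word window over each note and tallying the results. The Python sketch below illustrates the idea; the tokenization rule, the function name, and the two example notes are assumptions made for illustration and are not taken from <code>states_info.csv</code>.</p>

```python
import re
from collections import Counter


def trigrams(text):
    """Lowercase the text, split it into words, and return consecutive word triples."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return list(zip(words, words[1:], words[2:]))


# Hypothetical state notes, invented for this example.
notes = [
    "Total PCR tests reported in specimens.",
    "The state reports total PCR tests daily.",
]

counts = Counter()
for note in notes:
    counts.update(trigrams(note))

print(counts.most_common(3))
```

<p>The most frequent triples can then be passed to a word-cloud plot, with each triple sized by its count.</p>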
</div>
</div>
<div id="conclusion-and-summary" class="section level2">
<h2>Conclusion and Summary</h2>
<p>How is the COVID-19 condition in the United States now?</p>
<p>To begin with, the COVID-19 condition is not optimistic: the overview shows very large case numbers. First, infections are still growing, and at an increasing rate. Second, infection and impact differ across age groups and races/ethnicities. Third, the population infection rate varies across states and has already reached a significant level of around 2%. Finally, most states are mainly focused on COVID-19 testing; this is the key information we should pay attention to.</p>
</div>
</section>



<!-- code folding -->


<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

</body>
</html>
